DETAILED ACTION



Response to Arguments 

1. Applicant's arguments filed 04/20/10 have been fully considered but they are not
persuasive. 

Applicant argues that neither Boys nor Yokota discloses or suggests voice
recognition means for performing voice recognition on the audio data and generating
text data and word-marking data, the word-marking data indicating locations of word
boundaries between spoken words within the audio data and linking words in the audio
data to corresponding words in text data (Amendment, pages 9 and 10).

The examiner disagrees, since Boys discloses "Select functions, such as by
simple voice-recognition, wherein simple commands may be spoken to and
recognized by the Audio Editor. The problems in general voice recognition also are
far from trivial. ... a machine has a real problem determining where one word ends and
another begins. A user may speak a word or a phrase, and the system will rapidly
search the document for a data string to match the digital print of the spoken
phrase, moving the pointer to the beginning of a data string that matches" (moving
the pointer to the beginning of a data string that matches is considered as indicating
word-marking data; col. 2, lines 45 - 47; col. 6, line 66 - col. 7, line 1; col. 14, lines 17 - 22).
Boys further discloses: "In a preferred embodiment input of machine-operable text
code with the cursor in a voice region results in text being displayed in place of
equivalent portions of the voice region" (col. 4, lines 34 - 38).

Applicant argues that neither Boys nor Yokota discloses or suggests the control
means controlling the displaying on the display means of the stored text data that
corresponds to the audio data being replayed, as indicated by the word-marking data
(Amendment, pages 9 and 10).

The examiner disagrees, since Boys discloses "A user may speak a word or a
phrase, and the system will rapidly search the document for a data string to match
the digital print of the spoken phrase, moving the pointer to the beginning of a
data string that matches. In a preferred embodiment input of machine-operable text
code with the cursor in a voice region results in text being displayed in place of
equivalent portions of the voice region" (col. 4, lines 34 - 38; col. 14, lines 17 - 22).

Claim Rejections - 35 USC § 101 

2. 35 U.S.C. 101 reads as follows: 

Whoever invents or discovers any new and useful process, machine, manufacture, or composition of 
matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the 
conditions and requirements of this title. 

Claims 17-20 are rejected under 35 U.S.C. 101 because the claimed invention is
directed to non-statutory subject matter. Claims 17-20 are directed to a computer
readable medium storing processor executable instructions that is not limited to a
non-transitory, and thus statutory, medium. Because the term is not defined in the
specification, the scope of "computer-readable medium" can encompass signal-based
media such as "signals used to propagate instructions" and "carrier waves/pulses". A
signal does not fall within one of the four statutory categories of invention (i.e.,
process, machine, manufacture, or composition of matter) because it is ephemeral and
transient, and thus is non-statutory. Since the scope of "computer-readable medium"
can include these non-statutory instances, Claims 17-20 are directed to non-statutory
subject matter.



Claim Rejections - 35 USC § 103 

3. The text of those sections of Title 35, U.S. Code not included in this action can 
be found in a prior Office action. 

4. Claims 1 - 27 are rejected under 35 U.S.C. 103(a) as being unpatentable over
Boys et al. (US Patent 5,875,448) in view of Yokota et al. (EP 0597483).

Regarding claims 1 and 8, Boys et al. discloses an arrangement for replaying 
stored audio data (see col. 3, line 50), the system comprising: 

voice recognition means for performing voice recognition ("voice-recognition") on 
the audio data and generating by the voice recognition means text data and word- 
marking data ("beginning of a data string"), the word-marking data indicating 
locations of word boundaries between spoken words within the audio data ("a data 
string to match the digital print of the spoken phrase, moving the pointer to the 
beginning of a data string that matches"; col.2, lines 45 - 47; col. 6, line 66-col.7, 
line 1; col. 14, lines 17 - 22), and linking words in the audio data to corresponding words 
in the text data ("with the cursor in a voice region results in text being displayed in 
place of equivalent portions of the voice region"; col.4, lines 34 - 38); 

memory means for storing the audio data and for storing the text data and the 
word-marking data obtained from performing voice recognition on the audio data ("end
of the file"; see col. 3, lines 48, 49; col. 11, lines 5-8; col. 6, lines 65 - 67; col. 4, lines
12 and 34-38);

display means for visually displaying the text data ("with the cursor in a voice 
region results in text being displayed in place of equivalent portions of the voice 
region"; col. 4, lines 34 - 38);

audio replaying means for replaying the audio acoustically in a forward 
sequence; and control means for controlling the replaying of stored audio data in a 
forward mode and in a reverse mode, the control means controlling the audio replaying 
means during a playback of the audio data in the reverse mode to perform a reverse 
mode playback operation including, starting from a replay position in the audio data ("a 
function called Return associated with Play moves the pointer immediately back to the 
position it held in the file at the beginning of the play function. The jog and Play 
functions are provided for a user to find positions in the file where additions, editing, or 
other functions are to be performed"; col. 13, lines 5-8, and 30 - 33; col. 11, lines 1 - 8);

the control means controlling the displaying on the display means of the stored 
text data that corresponds to the audio data being replayed, as indicated by the word- 
marking data ("search the document for a data string to match the digital print of 
the spoken phrase, moving the pointer to the beginning of a data string that 
matches. In a preferred embodiment input of machine-operable text code with the 
cursor in a voice region results in text being displayed in place of equivalent 
portions of the voice region"; col.4, lines 34-38; col. 14, lines 17 - 22). 

However, Boys et al. do not specifically teach initiating a backward jump, counter
to the forward sequence over a distance corresponding to a length of at least N words 
using the word boundaries indicated in the word-marking data, to a target position, and 
then, starting from the target position, the control means initiates a replay of K words of 
the audio data in the forward sequence using the word boundaries indicated in the 
word-marking data, wherein K is less than N, the control means further controlling the 
audio replaying means and the display means to automatically repeat performing the 
reverse mode playback operation while the system is in the reverse mode. 
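
For illustration only, the claimed reverse mode playback operation recited above (jump back N words, replay K words forward, K less than N, repeat) can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from the application, Boys, or Yokota: the function name `reverse_mode_segments`, the representation of the word-marking data as a list of boundary times, and the stop-at-beginning rule are all hypothetical.

```python
from typing import List, Tuple


def reverse_mode_segments(
    boundaries: List[float], start_word: int, n: int, k: int
) -> List[Tuple[float, float]]:
    """Return the (start, end) time spans replayed in reverse mode.

    Assumption: boundaries[i] is the start time of word i and
    boundaries[-1] is the end of the audio (hypothetical layout of the
    word-marking data).
    """
    if not 0 < k < n:
        raise ValueError("K must be positive and less than N")
    if start_word <= 0:
        return []

    last = len(boundaries) - 1
    segments = []
    pos = min(start_word, last)          # current replay position (word index)
    while True:
        target = max(pos - n, 0)         # backward jump over N words
        end = min(target + k, last)      # forward replay of K words
        segments.append((boundaries[target], boundaries[end]))
        if target == 0:                  # reached the beginning: stop repeating
            break
        pos = end                        # net movement per cycle is N - K words back
    return segments
```

With six half-second words and the replay position at the end of the audio, `reverse_mode_segments([0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0], 6, 2, 1)` replays each word forward while stepping through the words in reverse order, which corresponds to the N=2, K=N-1 case.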

Yokota et al. teach that hybrid playback is a combination of fast playback
operations in cue and review modes. In this example, review playback is performed
program by program, but cue playback is performed within each program. Most
specifically, first the aforementioned cue playback is performed from the
beginning of the 5th program and after completion of the 5th program, the
playback jumps from the last data position of the 5th program to the beginning of
the 4th program, and the cue playback of the 4th program is performed ... Thereafter
the above playback operation is advanced similarly for the next and subsequent
programs (Performing cue playback in each program and jumping from the last data
position of that program to the beginning of the next and subsequent program implies
replaying of K words of the audio data in the forward sequence using the word
boundaries indicated in the word-marking data, since backward jumping is based on the
last data position of the program; col. 12, lines 3 - 20).
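
For illustration only, the hybrid playback order described above (review playback program by program, cue playback within each program) can be sketched as follows. This is a minimal sketch under stated assumptions and is not Yokota's implementation: the function names, the representation of each program as a (first sector, end sector) pair, and the default block and skip sizes are hypothetical.

```python
from typing import List, Tuple


def cue_blocks(start: int, end: int, block: int = 2, skip: int = 8) -> List[Tuple[int, int]]:
    """Sector ranges read during forward cue playback of one program.

    Assumption: a block of `block` sectors is played, then `skip` sectors
    are skipped, until the end of the program is reached.
    """
    blocks = []
    pos = start
    while pos < end:
        blocks.append((pos, min(pos + block, end)))
        pos += block + skip
    return blocks


def hybrid_playback(
    programs: List[Tuple[int, int]], from_index: int
) -> List[Tuple[int, List[Tuple[int, int]]]]:
    """Visit programs in descending order, cueing forward inside each one."""
    order = []
    for idx in range(from_index, -1, -1):   # jump back program by program
        start, end = programs[idx]
        order.append((idx, cue_blocks(start, end)))
    return order
```

For two hypothetical programs occupying sectors 0 to 20 and 20 to 30, `hybrid_playback([(0, 20), (20, 30)], 1)` cues through the second program first and then jumps back to the beginning of the first program.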

Therefore, it would have been obvious to one of ordinary skill in the art at the
time the invention was made to use hybrid playback as taught by Yokota et al. in Boys
et al., because that would provide an improved disc playback method which is capable
of performing fast playback (col. 1, lines 41 - 44).

Regarding claims 2 and 9, Yokota et al. further disclose that repeating the reverse
playback operation causes each of the K words on each repetition of the playback
operation to be replayed acoustically in the forward sequence and in order counter to
the forward sequence ("Most specifically, first the aforementioned cue playback is
performed from the beginning of the 5th program and after completion of the 5th
program, the playback jumps from the last data position of the 5th program to the
beginning of the 4th program, and the cue playback of the 4th program is performed";
col. 12, lines 3-20).

Regarding claim 3, Boys et al. further disclose that a counting means is assigned 
to control means in order to count the marking data reached during backward jumping 
or replaying (see col. 11, lines 1-8).

Regarding claim 4, Boys et al. further disclose that a timing circuit is assigned to 
control means in order to calculate the duration of the audio replay (see col. 11, lines 
41-50). 

Regarding claim 5, Boys et al. further disclose that setting means is connected to
control means in order to set the speed of the audio replay (see col. 11, lines 41-50). 

Regarding claims 6 and 15, Boys et al. further disclose that the control means is
further connected to text memory means for storing text data corresponding to the audio 
data (see col. 7, lines 44-49), which is connected to text display means (see col. 7, lines 
26-29), and wherein the control means is set up to initiate, by means of linkage data for 
the audio data and text data, a synchronous replaying of the audio data and the text 
data corresponding to it (see col. 12, lines 30-41, lines 52-67). 

Regarding claim 7, Boys et al. further disclose that the control means and the 
text memory means and the memory means for the audio data are connected to voice 
recognition means, which undertakes an automatic transcription of the audio data to 
generate the text data ("converted the recorded areas to text"; see col. 16, lines 35-42). 

Regarding claim 10, Boys et al. further disclose that replaying in the forward 
sequence is automatically terminated when the next word-marking data is reached 
during replaying (see col. 13, lines 1-8). 

Regarding claim 11, Boys et al. further disclose that replaying in the forward
sequence is automatically terminated after a specified period (see col. 13, lines 1-8). 

Regarding claim 12, Boys et al. further disclose that, upon termination of the replay in
the forward sequence, a backward jump over a return distance corresponding to the
length of at least roughly two words takes place automatically (see col. 13, lines 1-8).

Regarding claim 13, Boys et al. further disclose that the backward jump in
the audio data is undertaken at a speed that is higher than the replay speed during 
replaying in the forward sequence, and without acoustic replaying of the stored audio 
data ("operates at faster than normal"; paragraph 12, lines 55 - 60). 

Regarding claim 14, Boys et al. further disclose that the replaying of the
stored audio data in the forward sequence takes place at an adjustable replay speed 
(see col. 11, lines 41-47). 

Regarding claim 16, Boys et al. further disclose that during the visual
displaying of multiple words of the text data, the particular visually displayed word for 
which the corresponding audio data is being replayed is visually highlighted (see col. 4, 
lines 51-58, where the cursor highlights the word). 

Regarding claim 17, Boys et al. further disclose that the text data
corresponding to audio data is obtained by means of an automatic voice recognition of 
the audio data, wherein, simultaneously, the word-marking data is generated and stored 
as linkage data for the text data and audio data that correspond with each other 
("comparison can be made between the entered text and the voice-recorded"; see col. 7,
lines 36-50; col. 16, lines 35-48). 

Regarding claim 18, Boys et al. further disclose a computer program
product that can be loaded into a memory of a computer, and which comprises sections
of software code in order that, by means of their implementation following loading into
the memory, the method as claimed in claim 8 can be implemented with the computer
(see col. 16, lines 51-53).

Regarding claim 19, Boys et al. further disclose a computer program
product as claimed in claim 18, characterized in that it is stored on a computer-readable
medium (see col. 16, lines 51-53).

Regarding claim 20, Boys et al. further disclose a computer with a
processing unit and an internal memory, which computer is designed to implement the
computer program product as claimed in claim 18 (see col. 16, lines 51-53).

As per claim 21, Boys et al. teach an arrangement for replaying stored audio
data comprising: 

a voice recognition system configured to perform voice recognition on the audio 
data and to generate text data and word-marking data ("beginning of a data string"), 
the word-marking data indicating locations of word boundaries between spoken words
within the audio data ("a data string to match the digital print of the spoken phrase,
moving the pointer to the beginning of a data string that matches"; col. 2, lines 45
- 47; col. 6, line 66 - col. 7, line 1; col. 14, lines 17 - 22), and linking words in the audio
data to corresponding words in the text data ("with the cursor in a voice region results
in text being displayed in place of equivalent portions of the voice region"; col. 4,
lines 34 - 38);

a memory configured to store the audio data and to store the text data and the 
word-marking data obtained from performing voice recognition on the audio data ("end 
of the file... location of the file"; see col. 3, lines 48, 49; col. 11, lines 5-8; col. 6, lines 65
- 67; col. 4, line 12);

a display device configured to visually display the text data ("with the cursor in a 
voice region results in text being displayed in place of equivalent portions of the 
voice region"; col.4, lines 34 - 38); 

the controller further configured to display on the display device the text data that 
corresponds to the audio data being replayed, as indicated by the word-marking data 
("search the document for a data string to match the digital print of the spoken 
phrase, moving the pointer to the beginning of a data string that matches. In a 
preferred embodiment input of machine-operable text code with the cursor in a voice 
region results in text being displayed in place of equivalent portions of the voice 
region"; col.4, lines 34 - 38; col. 14, lines 17 - 22). 

Boys et al. do not specifically teach a controller configured to play back the audio
data in a reverse mode by jumping back N words using the word boundaries indicated
in the word-marking data, playing back K words using the word boundaries indicated in
the word-marking data, and then automatically repeating the jumping and playing back
while in the reverse mode, wherein K is less than N.

Yokota et al. teach that hybrid playback is a combination of fast playback
operations in cue and review modes. In this example, review playback is performed
program by program, but cue playback is performed within each program. Most
specifically, first the aforementioned cue playback is performed from the
beginning of the 5th program and after completion of the 5th program, the
playback jumps from the last data position of the 5th program to the beginning of
the 4th program, and the cue playback of the 4th program is performed ... Thereafter
the above playback operation is advanced similarly for the next and subsequent
programs (Performing cue playback in each program and jumping from the last data
position of that program to the beginning of the next and subsequent program implies
replaying of K words of the audio data in the forward sequence using the word
boundaries indicated in the word-marking data, since backward jumping is based on the
last data position of the program; col. 12, lines 3 - 20).

Therefore, it would have been obvious to one of ordinary skill in the art at the
time the invention was made to use hybrid playback as taught by Yokota et al. in Boys
et al., because that would provide an improved disc playback method which is capable
of performing fast playback (col. 1, lines 41 - 44).

As per claim 22, Yokota et al. further suggest that N=2 and K=N-1 ("first the
aforementioned cue playback is performed from the beginning of the 5th program and
after completion of the 5th program, the playback jumps from the last data position of the
5th program to the beginning of the 4th program, and the cue playback of the 4th
program is performed"; col. 12, lines 3 - 20).

As per claims 23 and 24, Yokota et al. further suggest that the controller is
configured to skip playback of a number of the words so that only every fourth or fifth of 
the words is replayed; configured to skip playback of a number of the words so that only 
every predetermined number of the words is replayed ("skipping 8 sectors which 
correspond to four of a 2-sector unitary block"; col. 10, lines 42 - 48). 

As per claim 25, Yokota et al. further disclose that the playing back is for a
predetermined duration, after which the automatic repeating of the jumping and the
playing back is performed ("first the aforementioned cue playback is performed from
the beginning of the 5th program and after completion of the 5th program, the playback
jumps from the last data position of the 5th program to the beginning of the 4th program,
and the cue playback of the 4th program is performed"; col. 12, lines 3 - 20).

As per claim 26, Yokota et al. further disclose that the jumping back is for a
return distance which is one of an estimated mean data duration of the N words and
determined from a word-marking data associated with the audio data ("the playback
jumps from the last position of the 5th program to the beginning of the 4th program";
col. 12, lines 3-20).

As per claim 27, Yokota et al. further disclose that the playing back is terminated in
response to reaching one of a word-marking data associated with an end of the Kth
word and a predetermined replay time ("cue playback is performed from the beginning
of the 5th program and after completion of the 5th program"; col. 12, lines 3 - 20).

Conclusion 

Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to LEONARD SAINT CYR whose telephone number is 
(571) 272-4247. The examiner can normally be reached Monday through Friday.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, Richemond Dorvil, can be reached on (571) 272-7602. The fax phone
number for the organization where this application or proceeding is assigned is
(571) 273-8300.

Information regarding the status of an application may be obtained from the
Patent Application Information Retrieval (PAIR) system. Status information for 
published applications may be obtained from either Private PAIR or Public PAIR. 
Status information for unpublished applications is available through Private PAIR only. 
For more information about the PAIR system, see http://pair-direct.uspto.gov. Should 
you have questions on access to the Private PAIR system, contact the Electronic 
Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a 
USPTO Customer Service Representative or access to the automated information 
system, call 800-786-9199 (IN USA OR CANADA) or (571) 272-1000.
LS 

07/03/10 

/Leonard Saint-Cyr/ 
Examiner, Art Unit 2626 



